add FanOutMapper for one-to-many partition fan-out by Lee-W · Pull Request #66030 · apache/airflow

Lee-W · 2026-04-28T15:26:14Z

Why

Roll-up partitioning currently handles N→1 (many upstream keys feeding one downstream run), but the symmetric 1→N case — one weekly key fanning out into seven daily Dag runs — has no first-class mapper. Authors had to hand-roll fan-out logic per Dag, which made the unbounded-keys footgun easy to hit.

closes: #65654

What

Add FanOutMapper composing upstream_mapper + window + downstream_mapper, mirroring RollupMapper's shape.
Resolve the default downstream mapper from the Window class name so SDK and core Window types both map to the same default (WeekWindow → StartOfDayMapper, etc.).
Add [scheduler] partition_fanout_max_keys (default 1000) capping downstream keys per upstream event; over-limit fan-outs are skipped and logged against the source task instance.
Wire FanOutMapper through assets/manager.py and serialization/encoders.py.
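
A minimal sketch of the 1→N idea described above, not the PR's actual implementation. It assumes the upstream key is an ISO date (YYYY-MM-DD) marking the start of a week and that downstream keys are daily ISO dates. The function name `fan_out_week` and the constant `PARTITION_FANOUT_MAX_KEYS` are hypothetical stand-ins for FanOutMapper and the `[scheduler] partition_fanout_max_keys` option.

```python
from datetime import date, timedelta

# Hypothetical stand-in for [scheduler] partition_fanout_max_keys (default 1000 per this PR).
PARTITION_FANOUT_MAX_KEYS = 1000


def fan_out_week(upstream_key: str, max_keys: int = PARTITION_FANOUT_MAX_KEYS) -> list[str]:
    """Expand one week-start key into seven daily keys.

    Returns an empty list when the fan-out would exceed max_keys, imitating
    the "skip the over-limit fan-out" behaviour (logging is omitted here).
    """
    # Upstream mapper step: decode the key into a date (assumed key format).
    week_start = date.fromisoformat(upstream_key)
    # Window step: enumerate the seven days covered by the week.
    days = [week_start + timedelta(days=offset) for offset in range(7)]
    # Downstream mapper step: encode each day as a start-of-day key.
    keys = [day.isoformat() for day in days]
    if len(keys) > max_keys:
        return []
    return keys
```

For example, `fan_out_week("2026-04-27")` yields the seven keys from 2026-04-27 through 2026-05-03, while `fan_out_week("2026-04-27", max_keys=3)` yields nothing because the cap is exceeded.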

Was generative AI tooling used to co-author this PR?

Yes (please specify the tool below)

Generated-by: [Claude] following the guidelines


Lee-W · 2026-04-28T15:26:35Z

only the last commit matters.

this is still not ideal, but at least it's not super wrong now

…t ordering

StartOfWeekMapper and StartOfQuarterMapper now derive their decode_downstream regex from output_format itself, so users can re-order strftime directives and {name} placeholders (e.g. "Q{quarter}/%Y") without overriding decode_downstream. A malformed output_format (empty {}, non-identifier placeholder names, duplicate %X directives, or duplicate {name} placeholders) raises ValueError at mapper construction instead of an opaque re.error from deep inside a scheduler tick or UI route.
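
One way to derive a decode regex from a format string and validate it eagerly, sketched under assumptions: the helper name `build_decode_regex`, the small directive table, the `d_` group-name prefix, and the `\d+` pattern for {name} placeholders are all hypothetical and are not taken from the Airflow code.

```python
import re

# Hypothetical subset of strftime directives and the digits each one emits.
DIRECTIVE_PATTERNS = {"Y": r"\d{4}", "G": r"\d{4}", "m": r"\d{2}", "d": r"\d{2}", "V": r"\d{2}"}

# Matches either a "%X" directive or a "{name}" placeholder (possibly empty).
_TOKEN = re.compile(r"%(?P<directive>.)|\{(?P<name>[^{}]*)\}")


def build_decode_regex(output_format: str) -> re.Pattern[str]:
    """Turn an output format into an anchored regex with one named group per token."""
    seen: set[str] = set()
    parts: list[str] = []
    pos = 0
    for match in _TOKEN.finditer(output_format):
        # Literal text between tokens is escaped so it matches verbatim.
        parts.append(re.escape(output_format[pos:match.start()]))
        pos = match.end()
        directive, name = match.group("directive"), match.group("name")
        if directive is not None:
            if directive not in DIRECTIVE_PATTERNS:
                raise ValueError(f"unsupported directive %{directive}")
            group, pattern = f"d_{directive}", DIRECTIVE_PATTERNS[directive]
        else:
            if not name.isidentifier():  # also rejects the empty "{}"
                raise ValueError(f"invalid placeholder name {name!r}")
            group, pattern = name, r"\d+"
        if group in seen:
            raise ValueError(f"duplicate token {match.group(0)!r}")
        seen.add(group)
        parts.append(f"(?P<{group}>{pattern})")
    parts.append(re.escape(output_format[pos:]))
    return re.compile("^" + "".join(parts) + "$")
```

Because validation happens while the pattern is built, a bad format fails at construction with ValueError rather than later as re.error. With this sketch, `build_decode_regex("Q{quarter}/%Y")` matches "Q2/2026" and captures the quarter and year in named groups.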

…ag_runs list

Drop the SQL "count distinct assets with any log" subquery and always compute total_received via the Python rollup-aware helper. The list endpoint previously returned different numbers for the same APDR depending on whether the caller filtered by dag_id (rollup-aware, counts upstream window keys) or queried globally (SQL approximation, counts assets with any log). The same field carried different semantics, which was confusing for any UI consumer. The N+1 cost of per-Dag timetable loads was already paid in the global branch for total_required, so adding a single batched log fetch keeps the existing query budget while making the contract identical across both views. _compute_received_count now skips asset_ids that are no longer required (active=False) so the relaxed log query doesn't over-count.
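
The counting rule can be illustrated with a small sketch. This is an assumption about the shape of the logic, not the body of `_compute_received_count`: the function name, the dict-of-sets inputs, and the notion that "received" means required upstream keys that have a log entry are all hypothetical.

```python
def compute_received_count(
    required_keys_by_asset: dict[int, set[str]],
    logged_keys_by_asset: dict[int, set[str]],
) -> int:
    """Count required upstream keys that have at least one log entry.

    Assets that appear in the logs but are no longer required (for example
    because they were deactivated) are skipped so they cannot inflate the total.
    """
    received = 0
    for asset_id, logged_keys in logged_keys_by_asset.items():
        required_keys = required_keys_by_asset.get(asset_id)
        if required_keys is None:
            continue  # asset is no longer required: ignore its logs
        received += len(required_keys & logged_keys)
    return received
```

Counting keys per asset, rather than assets with any log, is what makes the result rollup-aware in this sketch: an asset that needs seven upstream keys and has logged three contributes three, not one.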

StartOfWeekMapper now always uses ISO weeks (Monday) and StartOfMonthMapper always emits the 1st of the month. Custom fiscal boundaries can still be expressed by pairing a user-defined source mapper with the existing windows.
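
The two boundaries can be computed with the standard library alone. The helper names below are illustrative and are not the mappers' actual methods.

```python
from datetime import date, timedelta


def start_of_iso_week(day: date) -> date:
    """Return the Monday of the ISO week containing `day`."""
    # date.weekday() is 0 for Monday, so subtracting it lands on Monday.
    return day - timedelta(days=day.weekday())


def start_of_month(day: date) -> date:
    """Return the first day of the month containing `day`."""
    return day.replace(day=1)
```

For Friday 2026-05-08, `start_of_iso_week` returns 2026-05-04 and `start_of_month` returns 2026-05-01.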

The next_run_assets and partitioned_dag_runs endpoints used to load and deserialize the full timetable on every request just to read mapper attributes (is_rollup) and required-key counts. Cache mapper metadata per asset on DagModel during Dag sync via a new ``partition_mapper_info`` JSON column, so the UI resolves mapper attributes from the cache and only loads the timetable when ``to_upstream`` evaluation for rollup mappers is actually needed.
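
A rough sketch of the cache-first lookup, assuming the JSON column maps asset ids (as strings) to a small dict with an `is_rollup` flag. That layout, the function names, and the `load_timetable` callback are hypothetical; only the column name `partition_mapper_info` and the `is_rollup` / `to_upstream` names come from the description above.

```python
import json
from typing import Callable


def is_rollup(partition_mapper_info: str, asset_id: int) -> bool:
    """Read the cached flag without loading the timetable."""
    info = json.loads(partition_mapper_info)
    return bool(info.get(str(asset_id), {}).get("is_rollup", False))


def upstream_keys(
    partition_mapper_info: str,
    asset_id: int,
    key: str,
    load_timetable: Callable[[], Callable[[str], list[str]]],
) -> list[str]:
    """Resolve upstream keys, deserializing the timetable only for rollup mappers."""
    if not is_rollup(partition_mapper_info, asset_id):
        return [key]  # non-rollup: the cached metadata is enough
    to_upstream = load_timetable()  # expensive step, taken only when needed
    return to_upstream(key)
```

The point of the design is that the expensive callback is never invoked for non-rollup assets, so most requests are answered from the cached column.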

Composes upstream_mapper + window + (optional) fine_mapper, symmetric to RollupMapper. New [scheduler] partition_fanout_max_keys caps the downstream keys per upstream event.

boring-cyborg Bot added the labels area:API, area:ConfigTemplates, area:DAG-processing, area:Scheduler, area:task-sdk, and area:UI on Apr 28, 2026

Lee-W force-pushed the partition-fanout branch 17 times, most recently from e4d0931 to dd3408e on May 8, 2026 08:43

Lee-W added 6 commits May 8, 2026 20:30

feat(AIP-76): window

35e1002

perf: simplify SQL queries

87c2ff0

feat(ui): add one to many partition key support to UI

ddcd24a


feat(ui): update UI for to_upstream

8c2a7e8

docs: add example Dag for window partition case

3e24fac

feat: improve UX for both roll up and non-rollup cases

39d1b2c

Lee-W added 11 commits May 8, 2026 20:30

refactor: simplify event data handling

209ff68

feat: Add RollUpMapper and remove other outdated mappers

1d1b75e

fix: harden rollup mapper edge cases (boundary, error suppression, UI)

7929264

feat: write audit log when rollup mapper evaluation fails

5506d36

refactor: compile temporal mapper key pattern eagerly in __init__

846c0d8

refactor: use timetable.partitioned property for partition checks

c776f38

feat(AIP-76): add FanOutMapper for one-to-many partition fan-out

3b5dee1


Lee-W force-pushed the partition-fanout branch from dd3408e to 3b5dee1 on May 8, 2026 12:52

Lee-W wants to merge 17 commits into apache:main from astronomer:partition-fanout
